
Fix llm_id validation for image2text Gemini usage. #14320

Open

lokinhh wants to merge 5 commits into infiniflow:main from lokinhh:fix/accept-image2text-llm-id

Conversation


@lokinhh lokinhh commented Apr 23, 2026

Allow chat assistant llm_id validation to accept image2text models and resolve chat model config using actual llm type so Gemini vision models no longer fail with misleading doesn't-exist errors.

Made-with: Cursor

What problem does this PR solve?

RAGFlow currently validates chat assistant llm_id too strictly against model_type=chat, causing code 102 ("llm_id ... doesn't exist") when users configure Gemini as image2text. This PR allows llm_id validation to accept both chat and image2text, and resolves runtime model config based on the actual llm type, so Gemini vision models can be used without misleading validation errors.
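The relaxed validation described above can be sketched as follows. This is an illustrative outline only: `lookup_model_config` is a hypothetical stand-in for the tenant model lookup inside `chat_api.py`, not RAGFlow's actual API.

```python
# Sketch of the relaxed llm_id validation. lookup_model_config is a
# hypothetical stand-in for the tenant model lookup, not RAGFlow's real API.
def lookup_model_config(tenant_id, llm_id, model_type):
    # Toy registry: here Gemini is registered only under image2text.
    registry = {("t1", "gemini-1.5-pro", "image2text"): {"llm_name": "gemini-1.5-pro"}}
    return registry.get((tenant_id, llm_id, model_type))

def validate_llm_id(tenant_id, llm_id, model_type=None):
    # Use the provided type when it is chat or image2text; otherwise try both.
    if model_type in ("chat", "image2text"):
        candidate_types = [model_type]
    else:
        candidate_types = ["chat", "image2text"]
    return any(lookup_model_config(tenant_id, llm_id, t) is not None for t in candidate_types)

print(validate_llm_id("t1", "gemini-1.5-pro"))  # True: resolved as image2text
```

With this shape, a Gemini model registered only as `image2text` passes validation instead of failing with a misleading "doesn't exist" error.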

Type of change

  • Bug Fix (non-breaking change which fixes an issue)

@dosubot dosubot Bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Apr 23, 2026

coderabbitai Bot commented Apr 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

LLM ID validation and dialog model-resolution now try multiple candidate model types (chat, image2text) with ordered fallbacks; dialog model selection logic is centralized into a helper returning both resolved model_config and normalized llm_type, and callers were updated to use it.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Model Type Validation**<br>`api/apps/restful_apis/chat_api.py` | `_validate_llm_id` now builds a candidate model_type list (the provided type when it is chat or image2text, otherwise both) and returns success if any lookup matches; signature and return semantics unchanged. |
| **Dialog Model Resolution & Multimodal Normalization**<br>`api/db/services/dialog_service.py` | Added `_get_dialog_chat_model_config(dialog)` to centralize model-config lookup and normalized llm_type derivation; callers (`async_chat_solo`, `async_chat`, `get_models`) refactored to consume the helper. `get_models` signature updated to `get_models(dialog, llm_model_config=None, llm_type=None)`. Gemini multimodal inputs changed to OpenAI-compatible content blocks with text and image_url (base64 `data:image/png;base64,...`). |
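The centralized resolver summarized above might look roughly like the sketch below. `find_model_config` and the dialog shape are illustrative assumptions, not RAGFlow internals; the real helper also consults tenant defaults.

```python
def find_model_config(tenant_id, llm_id, llm_type):
    # Illustrative lookup standing in for the real model-config query;
    # returns a config dict or None.
    registry = {("t1", "gemini-1.5-pro", "image2text"): {"llm_name": "gemini-1.5-pro"}}
    return registry.get((tenant_id, llm_id, llm_type))

def get_dialog_chat_model_config(dialog):
    """Resolve (model_config, llm_type) with an ordered chat -> image2text fallback."""
    for llm_type in ("chat", "image2text"):
        cfg = find_model_config(dialog["tenant_id"], dialog["llm_id"], llm_type)
        if cfg is not None:
            return cfg, llm_type
    raise LookupError(f"llm_id {dialog['llm_id']!r} doesn't exist")
```

Returning the resolved llm_type alongside the config is what lets callers dispatch attachment handling without re-deriving the type themselves.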

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant DialogSvc as DialogService\n(_get_dialog_chat_model_config)
  participant ModelRepo as ModelConfigRepo
  participant TenantSvc as TenantService

  rect rgba(200,200,255,0.5)
    Client->>DialogSvc: request chat (dialog with llm_id?)
    DialogSvc->>ModelRepo: query model_config by (provided llm_id, provided model_type)
    alt found
      ModelRepo-->>DialogSvc: model_config
    else not found
      DialogSvc->>ModelRepo: query model_config by (llm_id, alternate model_type)
      alt found
        ModelRepo-->>DialogSvc: model_config
      else not found
        DialogSvc->>TenantSvc: resolve tenant_llm_id / tenant default
        TenantSvc-->>DialogSvc: tenant candidate llm_type(s)/ids
        DialogSvc->>ModelRepo: query model_config by tenant id/type
        ModelRepo-->>DialogSvc: model_config or none
      end
    end
    alt model_config found
      DialogSvc-->>Client: resolved model_config + llm_type
    else none
      DialogSvc-->>Client: error (not found)
    end
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

🐞 bug, 🐖api

Suggested reviewers

  • 6ba3i

Poem

🐇 I hop through code with eager feet,

I check two paths so resolutions meet,
Chat or image — I seek what's true,
Fetch the config, return it to you,
Little rabbit dances, proud and neat.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately captures the main fix: allowing llm_id validation to accept image2text models for Gemini usage. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description includes all required template sections, with a clear problem statement, type-of-change selection, and comprehensive context about the fix. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
api/db/services/dialog_service.py (1)

256-306: Recommended: reuse the helper (or at least its type resolution) in async_chat too.

Lines 501–505 of async_chat (outside this diff) still open-code the same llm_type == "image2text" dispatch that the new helper centralizes. Leaving that copy in place defeats much of the DRY benefit of introducing _get_dialog_chat_model_config and will silently diverge if fallback/priority rules change here. Consider either calling the helper in async_chat or factoring out a small _resolve_dialog_model_type(dialog) that both call sites share. Not blocking for this PR since that region is untouched, but worth a follow-up.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/db/services/dialog_service.py` around lines 256 - 306, The async_chat
implementation duplicates the llm_type resolution logic that
_get_dialog_chat_model_config already centralizes; update async_chat to call
_get_dialog_chat_model_config(dialog) (or extract a shared
_resolve_dialog_model_type(dialog) used by both) and use its returned
model_config/llm_type instead of re-implementing the "image2text" dispatch so
fallback/priority rules stay in one place and avoid silent divergence between
async_chat and get_models.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@api/db/services/dialog_service.py`:
- Around line 256-306: The async_chat implementation duplicates the llm_type
resolution logic that _get_dialog_chat_model_config already centralizes; update
async_chat to call _get_dialog_chat_model_config(dialog) (or extract a shared
_resolve_dialog_model_type(dialog) used by both) and use its returned
model_config/llm_type instead of re-implementing the "image2text" dispatch so
fallback/priority rules stay in one place and avoid silent divergence between
async_chat and get_models.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eab15055-956d-4655-9b45-5ebfe49e6f48

📥 Commits

Reviewing files that changed from the base of the PR and between 8901c18 and b60ebb6.

📒 Files selected for processing (2)
  • api/apps/restful_apis/chat_api.py
  • api/db/services/dialog_service.py

lokinhh added 2 commits April 23, 2026 13:08
Retry both chat and image2text model types when resolving dialog llm_id to prevent false lookup failures when type detection and tenant model records are temporarily inconsistent.

Made-with: Cursor
Reuse a single resolver for model config and llm type in async_chat paths so chat/image2text fallback rules stay consistent across runtime call sites.

Made-with: Cursor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/db/services/dialog_service.py`:
- Around line 233-251: _get_dialog_chat_model_config currently returns only a
config while callers (e.g., async_chat and places around lines 266/316/511)
still re-derive a model type via TenantLLMService.llm_id2llm_type, causing
mismatches when the fallback chooses a different type; change
_get_dialog_chat_model_config to return a tuple (resolved_type, config), update
all call sites to use the returned resolved_type before any attachment/image
handling or request-shape decisions (remove direct calls to
TenantLLMService.llm_id2llm_type where you now have the resolved_type), ensure
async_chat uses this helper instead of its own single-type lookup, and add
logging inside the helper to record fallback attempts and the final
resolved_type so resolution choices are auditable.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b12aed35-1d49-4310-8fc1-ff97625cbdcb

📥 Commits

Reviewing files that changed from the base of the PR and between b60ebb6 and 7314dd0.

📒 Files selected for processing (1)
  • api/db/services/dialog_service.py

Comment thread on api/db/services/dialog_service.py (outdated)
Return resolved model type together with model config and reuse it across async chat paths so attachment handling and request shape stay consistent with fallback selection.

Made-with: Cursor
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Apr 23, 2026

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
api/db/services/dialog_service.py (1)

518-540: Duplicate model-config resolution per chat request.

_resolve_dialog_chat_model(dialog) at Line 518 fully resolves llm_model_config, then get_models(dialog) at Line 540 calls _get_dialog_chat_model_config(dialog) again at Line 323 to build chat_mdl. That's two DB round-trips (plus the llm_id2llm_type lookup inside each) for the same dialog on every request. Consider threading the already-resolved config/type into get_models to avoid the re-lookup.

♻️ Proposed direction
-def get_models(dialog):
+def get_models(dialog, chat_model_config=None):
     embd_mdl, chat_mdl, rerank_mdl, tts_mdl = None, None, None, None
     ...
-    chat_model_config = _get_dialog_chat_model_config(dialog)
+    if chat_model_config is None:
+        chat_model_config = _get_dialog_chat_model_config(dialog)
     chat_mdl = LLMBundle(dialog.tenant_id, chat_model_config)

And at the async_chat call site:

-    kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(dialog)
+    kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(dialog, chat_model_config=llm_model_config)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/db/services/dialog_service.py` around lines 518 - 540, The code currently
calls _resolve_dialog_chat_model(dialog) to get llm_model_config and llm_type,
then calls get_models(dialog) which internally calls
_get_dialog_chat_model_config(dialog) again causing duplicate DB lookups; change
get_models to accept the already-resolved llm_model_config and llm_type (e.g.,
get_models(dialog, llm_model_config=None, llm_type=None)), update get_models
implementation to use the passed-in values instead of calling
_get_dialog_chat_model_config, and update all call sites (including async_chat
callers) to pass the resolved config/type so the second DB round-trip and
duplicate llm_id2llm_type lookup are eliminated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@api/db/services/dialog_service.py`:
- Around line 518-540: The code currently calls
_resolve_dialog_chat_model(dialog) to get llm_model_config and llm_type, then
calls get_models(dialog) which internally calls
_get_dialog_chat_model_config(dialog) again causing duplicate DB lookups; change
get_models to accept the already-resolved llm_model_config and llm_type (e.g.,
get_models(dialog, llm_model_config=None, llm_type=None)), update get_models
implementation to use the passed-in values instead of calling
_get_dialog_chat_model_config, and update all call sites (including async_chat
callers) to pass the resolved config/type so the second DB round-trip and
duplicate llm_id2llm_type lookup are eliminated.
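The pass-through that these review comments converge on can be sketched as follows. The names here (`resolve_chat_model_config`, the dict-based "bundle") are illustrative stand-ins; the real `get_models` also resolves embedding, rerank, and TTS models.

```python
def resolve_chat_model_config(dialog):
    # Stand-in for _get_dialog_chat_model_config: one "DB" lookup per call,
    # counted so we can observe whether resolution happens twice.
    resolve_chat_model_config.calls += 1
    return {"llm_name": dialog["llm_id"]}, "image2text"
resolve_chat_model_config.calls = 0

def get_models(dialog, llm_model_config=None, llm_type=None):
    # Reuse a pre-resolved config/type; resolve only when the caller passed neither.
    if llm_model_config is None and llm_type is None:
        llm_model_config, llm_type = resolve_chat_model_config(dialog)
    elif llm_model_config is None or llm_type is None:
        raise ValueError("llm_model_config and llm_type must be provided together")
    chat_mdl = {"tenant_id": dialog["tenant_id"], "config": llm_model_config}
    return chat_mdl, llm_type

dialog = {"tenant_id": "t1", "llm_id": "gemini-1.5-pro"}
cfg, t = resolve_chat_model_config(dialog)      # resolved once at the call site
chat_mdl, t2 = get_models(dialog, llm_model_config=cfg, llm_type=t)
print(resolve_chat_model_config.calls)          # 1: no duplicate resolution
```

Rejecting a half-supplied override (config without type, or vice versa) also addresses the later nitpick about all-or-nothing override semantics.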

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8de8361c-eee8-4851-86cf-689a1cfd6259

📥 Commits

Reviewing files that changed from the base of the PR and between 7314dd0 and c966c2e.

📒 Files selected for processing (1)
  • api/db/services/dialog_service.py


@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
api/db/services/dialog_service.py (1)

554-576: ⚠️ Potential issue | 🟡 Minor

Avoid resolving the chat model twice in the same request.

Line 554 resolves llm_model_config/llm_type, but Line 576 calls get_models(dialog), which resolves the chat config again via _get_dialog_chat_model_config(). If model records change mid-request or duplicate model-type records resolve differently, llm_type, max_tokens, Langfuse metadata, and the actual chat_mdl can diverge.

Consider passing the already-resolved chat config into get_models() or returning the resolved config/type from get_models() so the request uses one resolution result end-to-end.

🐛 Proposed direction
-def get_models(dialog):
+def get_models(dialog, chat_model_config=None):
@@
-    chat_model_config, _ = _get_dialog_chat_model_config(dialog)
+    if chat_model_config is None:
+        chat_model_config, _ = _get_dialog_chat_model_config(dialog)

     chat_mdl = LLMBundle(dialog.tenant_id, chat_model_config)
-    kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(dialog)
+    kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl = get_models(
+        dialog,
+        chat_model_config=llm_model_config,
+    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/db/services/dialog_service.py` around lines 554 - 576, The code currently
resolves the dialog chat config via _get_dialog_chat_model_config(dialog) into
llm_model_config/llm_type but then calls get_models(dialog) which calls
_get_dialog_chat_model_config again, risking divergent model resolution; update
get_models to accept the already-resolved config/type (e.g., change signature to
get_models(dialog, llm_model_config=None, llm_type=None) or similar) and use
those values inside get_models (or alternately have get_models return the
resolved llm_model_config and llm_type along with models), then change the call
site in dialog_service.py (the call that currently does kbs, embd_mdl,
rerank_mdl, chat_mdl, tts_mdl = get_models(dialog)) to pass the existing
llm_model_config/llm_type (or unpack the returned config) so the same resolution
(and derived values like max_tokens and Langfuse context) is used throughout the
request.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@api/db/services/dialog_service.py`:
- Around line 554-576: The code currently resolves the dialog chat config via
_get_dialog_chat_model_config(dialog) into llm_model_config/llm_type but then
calls get_models(dialog) which calls _get_dialog_chat_model_config again,
risking divergent model resolution; update get_models to accept the
already-resolved config/type (e.g., change signature to get_models(dialog,
llm_model_config=None, llm_type=None) or similar) and use those values inside
get_models (or alternately have get_models return the resolved llm_model_config
and llm_type along with models), then change the call site in dialog_service.py
(the call that currently does kbs, embd_mdl, rerank_mdl, chat_mdl, tts_mdl =
get_models(dialog)) to pass the existing llm_model_config/llm_type (or unpack
the returned config) so the same resolution (and derived values like max_tokens
and Langfuse context) is used throughout the request.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3be5d3d8-3444-456c-a2fc-ae86f0652b1c

📥 Commits

Reviewing files that changed from the base of the PR and between c966c2e and ba2647c.

📒 Files selected for processing (1)
  • api/db/services/dialog_service.py

Use OpenAI-compatible typed multimodal blocks for Gemini to avoid LiteLLM validation errors, and pass pre-resolved llm config/type into get_models to prevent duplicate resolution in async chat.

Made-with: Cursor
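The message shape this commit refers to would look roughly like the sketch below. This is an assumption-laden outline, not the actual `dialog_service.py` code: `build_multimodal_message` is a hypothetical helper, and the real implementation derives the image bytes from the dialog attachments.

```python
import base64

def build_multimodal_message(text, image_bytes):
    """OpenAI-compatible typed content blocks: one text part plus one
    base64-encoded image_url part, as accepted by LiteLLM for Gemini."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }

msg = build_multimodal_message("Describe this chart.", b"\x89PNG...")
```

Sending a plain string (or an untyped dict) where a typed content list is expected is what triggered the LiteLLM validation errors the commit mentions.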

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
api/db/services/dialog_service.py (1)

345-362: Override semantics in get_models are all-or-nothing — consider decoupling.

The guard if llm_model_config is None or llm_type is None means that if a caller passes only one of the two (e.g., a resolved llm_model_config but forgets llm_type, or vice versa), the helper silently re-resolves both and discards the caller's explicit value. Today the only non-default caller (async_chat, Lines 581–583) passes both, so this is latent, but it's an easy foot-gun for future callers.

♻️ Proposed tweak
-    if llm_model_config is None or llm_type is None:
-        llm_model_config, llm_type = _get_dialog_chat_model_config(dialog)
+    if llm_model_config is None and llm_type is None:
+        llm_model_config, llm_type = _get_dialog_chat_model_config(dialog)
+    elif llm_model_config is None or llm_type is None:
+        raise ValueError("llm_model_config and llm_type must be provided together")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/db/services/dialog_service.py` around lines 345 - 362, The current guard
in get_models uses "if llm_model_config is None or llm_type is None" which
re-resolves and overwrites both values even when only one is missing; change
this to resolve defaults but only fill the missing pieces: call
_get_dialog_chat_model_config(dialog) when either is None, capture its
two-return values (e.g., resolved_config, resolved_type), then set
llm_model_config = resolved_config only if llm_model_config is None and set
llm_type = resolved_type only if llm_type is None, then proceed to create
chat_mdl = LLMBundle(dialog.tenant_id, llm_model_config).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@api/db/services/dialog_service.py`:
- Around line 345-362: The current guard in get_models uses "if llm_model_config
is None or llm_type is None" which re-resolves and overwrites both values even
when only one is missing; change this to resolve defaults but only fill the
missing pieces: call _get_dialog_chat_model_config(dialog) when either is None,
capture its two-return values (e.g., resolved_config, resolved_type), then set
llm_model_config = resolved_config only if llm_model_config is None and set
llm_type = resolved_type only if llm_type is None, then proceed to create
chat_mdl = LLMBundle(dialog.tenant_id, llm_model_config).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 958c8bbc-1f96-4e9f-bd49-5aba916aa0af

📥 Commits

Reviewing files that changed from the base of the PR and between ba2647c and ae26000.

📒 Files selected for processing (1)
  • api/db/services/dialog_service.py


Labels

size:L This PR changes 100-499 lines, ignoring generated files.
